Abstract:
Existing knowledge-grounded conversation systems typically generate responses in a retrieve-then-generate manner. They require a large knowledge base and a strong knowledge retrieval component, which is time- and resource-consuming. In this paper, we address the challenge by leveraging the inherent knowledge encoded in pre-trained language models (PLMs). We propose Knowledgeable Prefix Tuning (KnowPrefix-Tuning), a two-stage tuning framework that bypasses the retrieval process in a knowledge-grounded conversation system by injecting prior knowledge into a lightweight knowledge prefix. The knowledge prefix is a sequence of continuous knowledge-specific vectors that can be learned during training. In addition, we propose a novel interactive re-parameterization mechanism that allows the prefix to interact fully with the PLM during the optimization of response generation. Experimental results demonstrate that KnowPrefix-Tuning outperforms fine-tuning and other lightweight tuning approaches, and performs comparably with strong retrieval-based baselines while being 3× faster during inference.
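The core idea above can be sketched in a few lines: a knowledge prefix is a short sequence of trainable continuous vectors prepended to the input representations so that the frozen PLM can attend to them at every step. The sizes and names below are illustrative placeholders, not the paper's actual configuration.

```python
# Toy sketch of a knowledge prefix (illustrative, not the paper's code).
PREFIX_LEN, SEQ_LEN, D_MODEL = 4, 6, 8

# In practice the prefix vectors are learned parameters updated during
# tuning; here they are fixed placeholder values.
knowledge_prefix = [[0.01 * (i + j) for j in range(D_MODEL)]
                    for i in range(PREFIX_LEN)]
# Token states would come from the frozen PLM's embedding layer.
token_states = [[0.0] * D_MODEL for _ in range(SEQ_LEN)]

def with_knowledge_prefix(prefix, states):
    """Prepend the prefix vectors so self-attention can read them."""
    return prefix + states

hidden = with_knowledge_prefix(knowledge_prefix, token_states)
assert len(hidden) == PREFIX_LEN + SEQ_LEN
```

Only the prefix parameters are optimized in such a setup, which is what makes the approach lightweight compared with full fine-tuning.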
Abstract:
Out-of-Domain (OOD) detection aims to identify whether a query falls outside the predefined intent set, which is crucial to maintaining high reliability and improving user experience in a task-oriented dialogue system. The key challenge is how to learn discriminative intent representations that help distinguish in-domain (IND) from OOD intents. However, previous methods ignore the compactness of instances within a class and the dispersion among categories, which limits OOD detection performance. In this paper, we propose a novel Hybrid Contrastive Learning (HybridCL) framework to model both intra-class and inter-class constraints in OOD detection. Specifically, we first propose an intra-class constraint contrastive learning (Intra-CCL) objective, which encourages instances of the same class to stay close to their prototypes, forming compact clusters. Then, we present an inter-class constraint contrastive learning (Inter-CCL) objective that enlarges the discrepancy among different classes, enforcing strong separability in the intent embedding space. In addition, to further enhance the discriminative representation capability of the encoder, we employ an intent-wise attention mechanism to capture the relationships between intents and their corresponding labels. Experiments and analysis on two public benchmark datasets show the effectiveness of our approach.
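The two constraints can be illustrated with a toy sketch: an intra-class term pulls each instance toward its class prototype, and an inter-class term penalizes prototype pairs that sit closer than a margin. The exact loss forms, the margin value, and the prototype names below are illustrative assumptions, not the paper's implementation.

```python
# Toy sketch of intra-class and inter-class contrastive constraints.
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def intra_ccl(instances, labels, prototypes):
    """Intra-class: pull each instance toward its class prototype."""
    return sum(sq_dist(x, prototypes[y])
               for x, y in zip(instances, labels)) / len(instances)

def inter_ccl(prototypes, margin=4.0):
    """Inter-class: penalize prototype pairs closer than a margin."""
    keys = list(prototypes)
    loss, pairs = 0.0, 0
    for i in range(len(keys)):
        for j in range(i + 1, len(keys)):
            loss += max(0.0, margin - sq_dist(prototypes[keys[i]],
                                              prototypes[keys[j]]))
            pairs += 1
    return loss / pairs

# Hypothetical 2-D intent embeddings for two intent classes.
protos = {"book_flight": [1.0, 0.0], "play_music": [0.0, 1.0]}
xs = [[0.9, 0.1], [0.1, 0.8]]
ys = ["book_flight", "play_music"]
total = intra_ccl(xs, ys, protos) + inter_ccl(protos)
```

Minimizing the combined objective simultaneously tightens each cluster and spreads the prototypes apart, which is the intuition behind modeling both constraints jointly.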
Abstract:
Retrieval-augmented generative models have shown promising results in knowledge-grounded dialogue systems. However, identifying and exploiting the exact knowledge in multiple retrieved passages given the dialogue context remains challenging because of the semantic dependencies within the context. Existing research has observed that increasing the number of retrieved passages improves the recall of relevant knowledge, but the gain in response generation becomes marginal, or even turns negative, once the number exceeds a certain threshold. In this paper, we present a multi-grained knowledge grounding identification method: the coarse-grained stage selects the most relevant knowledge from each retrieved passage separately, and the fine-grained stage refines these selections to identify the final knowledge used as grounding at the generation stage. To further guide response generation with the predicted grounding, we introduce a grounding-augmented copy mechanism in the decoding stage of dialogue generation. Empirical results on the MultiDoc2Dial and WoW benchmarks show that our method outperforms state-of-the-art methods.
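A copy mechanism of this kind can be sketched as a mixture at each decoding step: the output distribution blends the generator's vocabulary distribution with a distribution that copies tokens from the predicted grounding. The gate value, the uniform-by-count copy distribution, and the token examples below are illustrative assumptions, not the paper's exact formulation.

```python
from collections import Counter

def copy_augmented_dist(p_vocab, grounding_tokens, copy_gate):
    """p(w) = (1 - gate) * p_vocab(w) + gate * p_copy(w),
    where p_copy is proportional to token counts in the grounding."""
    counts = Counter(grounding_tokens)
    total = sum(counts.values())
    out = {}
    for w in set(p_vocab) | set(counts):
        p_copy = counts.get(w, 0) / total
        out[w] = (1 - copy_gate) * p_vocab.get(w, 0.0) + copy_gate * p_copy
    return out

# Hypothetical one-step example: the grounding span boosts "paris".
p_vocab = {"paris": 0.2, "london": 0.5, "the": 0.3}
dist = copy_augmented_dist(p_vocab, ["paris", "paris", "france"],
                           copy_gate=0.5)
```

The mixture keeps a valid probability distribution while letting grounding tokens dominate when the gate is high, which is how the predicted grounding steers decoding.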
Abstract:
Joint entity and relation extraction has received increasing interest recently, owing to its ability to exploit the interactions between the two steps. Among existing studies, the Multi-Head Selection (MHS) framework extracts entities and relations simultaneously and efficiently. However, its performance remains limited. In this paper, we propose several effective improvements to address this problem. First, we propose an entity-specific Relative Position Representation (eRPR) that allows the model to fully leverage the distance information between entities and context tokens. Second, we introduce an auxiliary Global Relation Classification (GRC) task to enhance the learning of local contextual features. Moreover, we improve the semantic representation by adopting the pre-trained language model BERT as the feature encoder. Finally, these components are closely integrated into the multi-head selection framework and optimized jointly. Extensive experiments on two benchmark datasets demonstrate that our approach outperforms previous work on all evaluation metrics, achieving significant relation F1 improvements of +2.40% on CoNLL04 and +1.90% on ACE05.
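One plausible reading of an entity-specific relative position representation is a signed, clipped distance from each context token to an entity span, which would then index a learned embedding table. The clipping value and index scheme below are illustrative assumptions, not the paper's definition.

```python
# Toy sketch: signed, clipped distances from each token to an entity span.
def relative_positions(seq_len, ent_start, ent_end, max_dist=4):
    """Distance 0 inside the span, negative before it, positive after,
    clipped to [-max_dist, max_dist]."""
    pos = []
    for i in range(seq_len):
        if i < ent_start:
            d = i - ent_start
        elif i > ent_end:
            d = i - ent_end
        else:
            d = 0
        pos.append(max(-max_dist, min(max_dist, d)))
    return pos

# Hypothetical 8-token sentence with an entity at positions 3..4.
rel = relative_positions(8, ent_start=3, ent_end=4)
```

Each clipped index would select a position embedding shared across entities, letting the model weigh context tokens by their distance from the entity under consideration.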
Abstract:
Taking advantage of the rapid growth of community platforms such as Yahoo Answers and Quora, Community Question Answering (CQA) systems are developed to retrieve semantically equivalent questions when users raise a new query. A typical CQA system consists of two key components, a retrieval model and a ranking model, which search for similar questions and select the most relevant one, respectively. In this paper, we propose LARQ (Learning to Ask and Rewrite Questions), a novel sentence-level data augmentation method. Unlike common lexical-level data augmentation approaches, we take advantage of a Question Generation (QG) model to obtain more accurate, diverse, and semantically rich query examples. Since queries differ greatly in a low-resource cold-start scenario, incorporating the QG model to augment the indexed collection significantly improves the response rate of CQA systems. We incorporate LARQ into an online CQA system and the Bank Question (BQ) Corpus to evaluate the enhancements to both the retrieval process and the ranking model. Extensive experimental results show that the LARQ-enhanced model significantly outperforms single BERT and XGBoost models, as well as a widely used QG model (NQG).
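The augmentation step can be sketched as follows: a QG model (stubbed below with a hypothetical template generator) produces additional query variants, which are added to the retrieval index pointing at the same answer as the original question. The stub and the sample data are illustrative, not LARQ's actual generator.

```python
# Toy sketch of sentence-level augmentation of a CQA index.
def generate_questions(answer_text):
    """Stub for a QG model; a real system would use a trained generator."""
    return [f"what about {answer_text}?",
            f"how does {answer_text} work?"]

def augment_index(index, qa_pairs):
    """Add original and generated questions, all mapped to the answer."""
    for question, answer in qa_pairs:
        index.setdefault(question, answer)
        for gen_q in generate_questions(answer):
            index.setdefault(gen_q, answer)  # same answer, new surface form
    return index

idx = augment_index({}, [("reset my card pin", "card pin reset steps")])
```

Because retrieval matches a new query against every indexed question, each generated variant widens the net of queries the system can answer, which is what raises the response rate in a cold-start setting.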
Abstract:
Context-dependent Text-to-SQL aims to translate multi-turn natural language questions into SQL queries. Although various methods exploit context-dependence information implicitly for contextual SQL parsing, few attempts explicitly address the dependencies between the current question and the question context. This paper presents QURG, a novel QUestion Rewriting Guided approach that helps models achieve adequate contextual understanding. Specifically, we first train a question rewriting model to complete the current question based on the question context, and convert the result into a rewriting edit matrix. We further design a two-stream matrix encoder to jointly model the rewriting relations between question and context and the schema-linking relations between natural language and the structured schema. Experimental results show that QURG significantly improves performance on two large-scale context-dependent datasets, SParC and CoSQL, especially for hard and long-turn questions.
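One plausible encoding of a rewriting edit matrix is a binary matrix over (current-question token, context token) pairs that marks which context tokens the rewriting model splices into the current question, attached to the nearest question token in the rewrite. The construction and example below are illustrative assumptions, not necessarily QURG's exact matrix.

```python
# Toy sketch of deriving an edit matrix from a rewritten question.
def edit_matrix(question, context, rewritten):
    """m[i][j] = 1 if context token j is spliced into the rewrite,
    anchored at question token i (the nearest preceding question token)."""
    m = [[0] * len(context) for _ in question]
    for j, c in enumerate(context):
        if c in rewritten and c not in question:  # pulled in from context
            k = rewritten.index(c)
            anchors = [t for t in rewritten[:k] if t in question]
            i = question.index(anchors[-1]) if anchors else 0
            m[i][j] = 1
    return m

# Hypothetical turn: "it" in the current question refers to a context entity.
q = ["how", "long", "is", "it"]
ctx = ["the", "great", "wall"]
rw = ["how", "long", "is", "the", "great", "wall"]
M = edit_matrix(q, ctx, rw)
```

A matrix in this form can be fed to an encoder as token-pair relation labels, alongside the schema-linking relations, rather than forcing the parser to consume the rewritten sentence directly.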
Abstract:
As the core component of an aircraft, the avionics system has evolved through independent, federated, and integrated architectures to deal with increasingly complex air combat scenarios. With the development of artificial intelligence and integrated circuits, avionics systems should be prepared to host a large number of intelligent applications, which helps complete military missions more effectively. In addition, the next generation of avionics systems needs to be more flexible, so that it can be conveniently updated and allow intelligent applications to communicate with each other effectively. Therefore, to achieve a more intelligent and flexible avionics system, this paper proposes a novel Service-oriented Intelligent Avionics System Architecture, named SIASA. With such a service-oriented architecture, SIASA provides a flexible software platform by treating intelligent applications as services. Specifically, the proposed SIASA consists of seven layers: the airborne intelligent chip layer, abstraction layer, hashrate abstraction layer, equipment abstraction layer, airborne service capability layer, airborne intelligent application atomic service layer, and airborne intelligent application layer. With these layers, SIASA supports intelligent applications well and enables effective and efficient updates. Furthermore, SIASA also supports more advanced scenarios, including airborne OTA updates, drag-and-drop software development, and system intelligence.